- UU version 2.1 -- A small, fast, and smart uudecoder
- (C) January 1994 -- ir.drs. B.J. Walbeehm
-
- January 17, 1994
-
- Introduction
- ~~~~~~~~~~~~
- Until further notice, whenever I have a new version of UU, I shall upload it
- to the FTP site wuarchive.wustl.edu (directory: /pub/MSDOS_UPLOADS/uucode), as
- well as to the alt.binaries.pictures.misc and alt.binaries.pictures.utilities
- newsgroups on USENET.
-
- UU is a freeware program; please read the file INFO.TXT for more information
- on what I mean by this. If the file INFO.TXT was not included in your UU
- package, then you can obtain it by e-mailing (which is preferred), writing,
- or calling (which is least preferred) me; the addresses may be found at the
- end of this file. In short, the only thing I ask from you when you decide
- that this program is of use to you, is that you send me an e-mail.
-
- I have written this program primarily for my own convenience; the first time
- I downloaded (a lot of) uuencoded files from the USENET binaries, it took me
- over four hours to edit everything in such a way that the only uudecoder I
- had then (a very naive one) could process them. That was a once-but-never-
- again experience.
-
- Starting with this program, I have broken with my rule to write programs that
- run even on an 8086 based machine. The reason is that (as I said) I write my
- programs first and foremost for myself, and since I "never" use an 8086 ...
- But I can easily convert this program to an 8086 compatible version, and on
- popular demand, I may even be willing to do this. Just let me know if you
- desperately want an 8086 compatible version. To be clear: UU version 2.1
- requires an 80286 or higher.
-
- I have not yet figured out what the minimal DOS version is that this program
- requires. (I am currently using MS-DOS 6.20, and I do not have versions of
- MS-DOS lying around lower than 5.00.) Anyway, I am quite sure that UU also
- runs on "very low" DOS versions. I learnt that there still are people using
- an 8088 based machine ... are there actually still people using, say, MS-DOS
- 3.00 or below? Or are these versions extinct?
-
- As for memory requirements: The amount of RAM free for executables should be
- at least 65k (UU uses two 28k buffers to speed up reading and writing) for
- this program to work correctly. UU will check if there is enough RAM free,
- and complain if there is not. (I hear some people asking: "65k?" ... Yes,
- I know we are talking .COM here, but that does NOT mean we are restricted to
- 64k now, does it?)
-
- As with all the programs I write, a short usage message is included in UU.
- This message may be displayed by entering any of the following three
- commands:
- UU /?
- UU -?
- TYPE UU.COM
-
- Starting with version 2.0, UU no longer displays a usage message when one
- merely enters "UU". The reason for this is that I think that one should never
- get accustomed to invoking a program without parameters or switches just to
- get help, for there are numerous programs that really do something then. In
- fact, I have written a program ("REMDIR.EXE") that can (depending on whether
- one really wants it to do what it does then) have disastrous effects then.
- What I am trying to say is: Never rely on a program to give you help by
- invoking it without any parameters or switches ...
-
-
- On the uuencoding standard
- ~~~~~~~~~~~~~~~~~~~~~~~~~~
- In my opinion, the uuencoding standard is not very well thought-out. As long
- as an encoded file consists of only one section (in the early days, splitting
- an encoded file up into more than one section was most probably not allowed),
- there is not much wrong with the standard, but as soon as the necessity arose
- for files to be split up, the standard should have been changed as well.
- To start with, there is no standard way of designating non-section parts,
- so the standard provides us with no means whatsoever to distinguish between
- encoded sections and mere comments. Also, the standard does not describe a
- way of deciding which sections belong together, nor in which order. Most
- uuencoders put such additional information in the files, but with the lack
- of a standard, almost every single one of them has its own way of doing this.
- A number of encoders will also put one or more checksums in the file, but
- again, this has not been standardised. It would have been very easy to devise
- a standard for adding such additional information, but it has not been done,
- and it may be far too late now ...
-
-
- Command line parameters and switches
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Although the usage message says "UU [drive:][path]filename[.ext] [/I] [/S]",
- UU allows all kinds of variations on this: Instead of a slash ("/"), a dash
- ("-") is accepted as well. UU of course accepts both uppercase and lowercase,
- and ignores irrelevant blanks (spaces). Also, using a switch twice or more
- has the same effect as using it only once. Moreover, switches (currently, the
- switches are "I" and "S") may be combined, and the order in which the filename
- and the switches (if any) appear on the command line is irrelevant. This means
- that, for instance, all of the following commands are treated identically:
- UU example.uue /I /S
- UU example.uue -I -S
- Uu exAmplE.Uue/s -I
- uu/s example.uue/i
- uu example.uue -is
- uu /is example.uue
- uu example.uue /s/i
- uu/i -sisssis example.uue
-
- Please note that if the dash ("-") is used to precede a switch, it must be
- preceded by at least one blank, since DOS allows dashes also to be part of
- a filename (EXCEPT as the LEADING character of a filename). This means that
- the following two commands are NOT identical:
- uu temp-i
- uu temp/i
-
- The former command processes a file called "temp-i" using no switches, while
- the latter will use the switch "i" on a file called "temp". So if the latter
- interpretation is meant, and one wants to use the dash, then make sure that
- at least one blank precedes it, as in:
- uu temp -i
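
The rules above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not UU's actual parser (which is written in assembly); the function name `parse_tail` and the choice to upper-case the filename are assumptions made for the example. It takes the text that follows the program name on the command line.

```python
VALID_SWITCHES = {"I", "S"}  # the two switches the manual names


def parse_tail(tail):
    """Parse the text after the program name into (filename, switches).

    Hypothetical helper: a slash starts a switch group anywhere, a dash
    only at the start of a blank-separated token.
    """
    filename = None
    switches = set()
    for token in tail.split():          # irrelevant blanks are ignored
        head, *groups = token.split("/")
        if head.startswith("-"):
            # Leading dash: the rest of the head is a switch group.
            groups.insert(0, head[1:])
        elif head:
            if filename is not None:
                raise ValueError("more than one filename given")
            filename = head.upper()     # DOS filenames are case-insensitive
        for group in groups:
            for letter in group.upper():
                if letter not in VALID_SWITCHES:
                    raise ValueError("unknown switch: " + letter)
                switches.add(letter)    # repeating a switch changes nothing
    return filename, switches
```

With this sketch, `parse_tail("/i -sisssis example.uue")` and `parse_tail("example.uue /I /S")` give the same result, while `parse_tail("temp-i")` yields a file called TEMP-I with no switches at all.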
-
-
- What UU does, and does not do
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Unlike what I have seen in some other uudecoders, UU does NOT assume an
- extension of .UUE if no extension is given. (Let me know if this bothers
- you.) This is for my own convenience, since most of the files I get to
- process have no extension.
-
- The current version of UU does not allow encoded files to be split up into
- different files, so if, for example, a file called EXAMPLE.EXE has been
- converted (by some uuencoder) into three sections, and each section has
- been written to a different file, say EXAMPLE1.UUE, EXAMPLE2.UUE, and
- EXAMPLE3.UUE, then UU will not be able to retrieve the original file.
- Until I have implemented a multiple source handler, one can work around
- this restriction by first executing the following command (still using the
- same example) from the DOS prompt:
- COPY /b EXAMPLE1.UUE + EXAMPLE2.UUE + EXAMPLE3.UUE EXAMPLE.TMP
- and then feeding the resulting combined file EXAMPLE.TMP to UU. Note that the
- /b switch has been added just to be sure; if the source files come directly
- from a (any) uuencoder, then it will not be necessary, but in that case it
- will do no harm either. Some posting programs, however, put a CTRL-Z character
- in the file, in which case the /b switch is absolutely required. Please note
- also that if (and only if) the (here) three source files appear in increasing
- order in the directory (so EXAMPLE1.UUE comes before EXAMPLE2.UUE, which in
- turn comes before EXAMPLE3.UUE), the following DOS command will correctly
- combine them as well:
- COPY /b EXAMPLE*.UUE EXAMPLE.TMP
- The restriction of the files appearing in increasing order in the directory
- when using the latter COPY command usually does not apply when UU is used in
- its "unsorted sections" mode on the resulting file. For more information on
- unsorted sections, see the appropriate chapter in this manual. Please note
- that in order for the COPY command to work correctly, the resulting file
- (EXAMPLE.TMP in the above examples) should have an extension that differs
- from any of the files that are to be concatenated (the files ending in .UUE
- in the above examples).
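
For readers who are not on DOS, the effect of the COPY /b work-around can be approximated as follows. This is a sketch only; the function name `concatenate` is made up for the example. The point it illustrates is that the parts are joined byte for byte, so a CTRL-Z (0x1A) inside a part does not cut the result short.

```python
import shutil


def concatenate(sources, destination):
    """Join the source files, in the order given, into one file.

    Everything is opened in binary mode, which is the equivalent of the
    /b switch: no byte (CTRL-Z included) is treated as end of file.
    """
    with open(destination, "wb") as out:
        for name in sources:
            with open(name, "rb") as part:
                shutil.copyfileobj(part, out)
```

A call such as `concatenate(["EXAMPLE1.UUE", "EXAMPLE2.UUE", "EXAMPLE3.UUE"], "EXAMPLE.TMP")` then produces the combined file that can be fed to UU.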
-
- If no switches are used (and ONLY then), UU does not allow sections to be in
- any other than increasing order in the file. (Please refer to the chapter on
- unsorted sections for information on how to handle these.) In particular,
- this means that this version acts the same as the earlier 1.x versions in
- case no switches are used. In this mode, the 2.x versions are still as fast
- as UU version 1.3 (which is the fastest of the 1.x versions), so even if one
- never dealt with unsorted sections, then the only advantage of using version
- 1.3 would be its smaller size. One of the disadvantages of version 1.3 is
- that it contains a small bug -- due to one (!) erroneous byte, it does not
- allow the INPUT file to have a name of length 1.
-
- UU always allows the source file to contain more than one uuencoded file, and
- each of these files may consist of any number of sections. If no switches are
- used, then these sections MUST be in the correct order. So in this case, a
- file containing the following sections:
- <file 1 part 1>
- <file 1 part 2 (last part)>
- <file 2 part 1>
- <file 2 part 2>
- <file 2 part 3 (last part)>
- will be handled correctly by UU (and result in two files), whereas
- <file 1 part 1>
- <file 2 part 1>
- <file 1 part 2>
- <file 2 part 2>
- <file 2 part 3>
- and
- <file 1 part 2>
- <file 1 part 1>
- <file 2 part 1>
- <file 2 part 3>
- <file 2 part 2>
- will not. Again, this restriction does NOT apply when UU is told that the
- file may contain unsorted sections.
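
The ordering requirement of the switchless mode can be stated as a small check. The sketch below is an illustration under assumptions: it supposes that each section is already known as a tuple (file, part, is_last), which is not how UU itself works, and the name `is_sorted_order` is invented for the example.

```python
def is_sorted_order(sections):
    """Return True if the sections could be decoded without switches.

    `sections` is a list of (file_id, part_number, is_last_part) tuples,
    a hypothetical representation chosen for this example.
    """
    expected_part = 1
    current_file = None
    for file_id, part, is_last in sections:
        if expected_part == 1:
            current_file = file_id       # a new file starts here
        if file_id != current_file or part != expected_part:
            return False                 # interleaved or out of order
        expected_part = 1 if is_last else part + 1
    return expected_part == 1            # the last file must be complete


good = [(1, 1, False), (1, 2, True),
        (2, 1, False), (2, 2, False), (2, 3, True)]
interleaved = [(1, 1, False), (2, 1, False), (1, 2, True),
               (2, 2, False), (2, 3, True)]
shuffled = [(1, 2, True), (1, 1, False),
            (2, 1, False), (2, 3, True), (2, 2, False)]
```

Here `is_sorted_order(good)` holds, whereas the other two lists (the second and third layouts above) are rejected.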
-
- When used in the "sorted order" mode of operation, UU can handle any number of
- sections contained in one input file; there is no limit. The only thing that
- may happen (apart from your hard disk getting full), is that some of the
- numbers that UU displays will not be correct, but this only happens if the
- number of sections in one file exceeds 9999. (Yes, I know I used the number
- 65535 in a previous manual, but that was a mistake. That is what happens when
- you socialise with computers too much.)
-
- If the program terminates or aborts after having detected some error, an
- ERRORLEVEL of 1 is returned; a successful termination results in ERRORLEVEL 0.
-
- Some platforms do not have the restriction of filenames being only at most
- 8+3 characters long, so the filename in the header of the first section of
- an encoded file may not be DOS-compliant. UU recognises this, and prompts
- the user for a new filename.
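
What "DOS-compliant" means here can be approximated with a regular expression. The manual does not say exactly which test UU applies, so the following is merely a plausible sketch: the helper name `is_dos_83` and the set of permitted characters are assumptions.

```python
import re

# Characters commonly permitted in DOS filenames (an assumption made for
# this sketch; UU's own rule is not documented).
_SET = "[" + re.escape("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_^$~!#%&-{}()@'`") + "]"
_PATTERN = re.compile(_SET + r"{1,8}(\." + _SET + r"{0,3})?")


def is_dos_83(name):
    """Return True if `name` fits the 8+3 scheme: 1-8 characters,
    optionally followed by a dot and at most 3 more characters."""
    return _PATTERN.fullmatch(name.upper()) is not None
```

Under these assumptions, EXAMPLE.ZIP passes, while a name such as averylongname.tar.gz does not and would lead to a prompt for a new filename.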
-
- If the filename for an encoded file already exists, the user is informed of
- this, and may then choose to either overwrite the old file, or rename the new
- one. At this point, CTRL-Break (and CTRL-C) may be used to abort the process.
-
- As opposed to some other uudecoders, UU does not choke on CTRL-Z characters.
-
- UU ignores lines that are not uuencoded, typically before and after sections.
- I saw somewhere that a uudecoder written by someone else could be notified
- that (for example) "---" is not a decodable line, as it seems that this line
- is used as a cut line on several BBS systems. With UU, it is not possible to
- designate such a non-decodable line ... merely because UU does not need that
- information to determine that a given line is not to be treated as a uuencoded
- line. UU uses four ways to determine whether a line is a mere comment or not,
- and treats the line as an encoded line only if all four ways show it is not a
- comment. These tests are partly performed simultaneously, and always in such a
- way as to require hardly any additional time (e.g. when the data required for
- a test is available due to some other action currently being performed).
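
To give an idea of how such a test can work without being told what a cut line looks like, here is a sketch of two structural checks that follow from the uuencoding format itself. These are not claimed to be UU's four tests (the manual does not describe them); the function name `looks_uuencoded` is invented, and encoders that strip trailing blanks or append a checksum character would need a more lenient length check.

```python
def looks_uuencoded(line):
    """Return True if `line` has the shape of an encoded data line.

    Check 1: every character lies in the uuencoding range (32..96).
    Check 2: the first character encodes the number of data bytes, and
             the line length must agree with it (4 characters for every
             3 bytes, plus the length character itself).
    """
    line = line.rstrip("\r\n")
    if not line:
        return False
    if any(not 32 <= ord(ch) <= 96 for ch in line):
        return False
    byte_count = (ord(line[0]) - 32) & 63
    expected_length = 1 + 4 * ((byte_count + 2) // 3)
    return len(line) == expected_length
```

A line such as "---" fails the length check (its first character announces 13 bytes, which would need 21 characters), and ordinary text containing lowercase letters already fails the range check.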
-
- Although UU is quite intelligent, it is possible to fool it, but I think that
- this is purely academic, for the chances of it being fooled are astronomically
- small (unless someone intentionally fooled UU). Even if one decoded hundreds
- of thousands of uuencoded files, it would most probably occur not even once
- that UU was fooled. And if it should ever occur that UU is fooled, then,
- please, do not blame UU or me, but blame the one who invented the uuencoding
- standard for not making it more strict. Or, put in another way: All uudecoders
- can be fooled, but mine must be one of the most reliable ones as I can easily
- show by a simple computation of probabilities. Of course, UU cannot perform
- miracles, so if the uuencoded file is corrupt to begin with, UU will be
- helpless too.
-
-
- Handling unsorted sections
- ~~~~~~~~~~~~~~~~~~~~~~~~~~
- UU can also handle files containing randomly ordered sections. For this mode
- of operation, two switches are available: /I and /S. When invoked with /I only,
- UU will scan the source file, and it will subsequently report what it has found
- there, but it will not actually decode anything. When invoked with both /I and
- /S (or any equivalent notation -- see the chapter on command line parameters
- and switches), it WILL start decoding after having reported the information.
- A less verbose, but equally efficient result is obtained by specifying only
- the /S switch.
-
- Although there is a maximum to the number of sections that UU can handle using
- this "unsorted sections" mode of operation, this can hardly be considered a
- restriction, since this maximum number is 434.
-
- This mode of operation, although still very fast, is slower than the "sorted
- order" mode. Just how much slower depends on the order in which the sections
- appear. Worst case performance (in terms of speed) is when the sections appear
- in reversed order; considerable gains may be achieved on systems using disk
- caches and/or RAM drives.
-
- Since the "sorted order" mode uses one very powerful assumption (viz. the
- sections being in sorted order), whereas the "unsorted sections" mode can (at
- best) only rely on whatever information it filters out of the source file, it
- is possible for UU to obtain better results in the former mode. So I recommend
- using the "sorted order" mode whenever one is sure that every section appears
- in the correct order (which, as noted earlier, also is faster).
-
- So how does UU obtain its information? The current version of UU recognises
- more than fifteen different uuencoders and posting programs. (For the ease of
- discussion, I shall use the term "uuencoders" when I mean "uuencoders and/or
- posting programs" in the remainder of this manual.) As far as I know, these
- mostly are uuencoders used on PCs and UNIX systems, but I'd rather wait with
- listing the uuencoders it recognises until I have found out which ones most of
- them are.
-
- If it cannot recognise the uuencoders that were used, or if these have not
- included all of the necessary information in the file, UU tries to use the
- "Subject:" lines (if it finds any) that may be included if the file contains
- postings from USENET. Instead of "Subject:" lines, some newsreaders produce
- "Description:" lines; these are also supported by UU. In the remainder of this
- manual, I shall no longer refer to "Description:" lines, but whatever holds
- for "Subject:" lines, also applies to "Description:" lines.
-
- If postings from USENET are used, I recommend NOT chopping off the headers
- (and thus the "Subject:" lines) for a higher chance of success. "Subject:"
- lines are used only if all else fails, because of the higher chance of these
- containing errors. For instance, someone may have erroneously given a five part
- file a subject line of "EXAMPLE.ZIP (4/6)" indicating that there are six parts.
- But even when things like this happen, there is a good chance that UU will
- successfully decode these files all the same. To end this subject (no pun
- intended), some examples of "Subject:" lines, and how they will be processed
- by UU:
- - Subject: EXAMPLE.ZIP (4/6)
- UU sees this as part four of a six part file called EXAMPLE.ZIP.
- - Subject: PICTURE.GIF {Just another picture} [01/10]
- As expected, UU will see this as part one of a ten part file called
- PICTURE.GIF.
- - Subject: Repost:AGAIN.EXE(Part3of20).Reposted on popular demand.
- Yes, UU will assume it is dealing with part three of a twenty part file
- called AGAIN.EXE.
- - Subject: >FOOBAR.JPG (b/w) {Another picture} (part 3/5.
- UU is not fooled by "(b/w)", nor by the ">"; it will correctly assume
- this is part three of a five part file called FOOBAR.JPG.
- - Subject: - FooBar.Jpg {Another picture /0 } part04 of5} (6 /w ).
- Even this does not fool UU; it assumes it is dealing with part four of a
- five part file called FooBar.Jpg. Moreover, UU will see this as a further
- part of the same file as in the previous example.
- Although these examples show that UU is quite "intelligent" while dealing with
- these lines, I realise that my "Subject:" line parser still leaves room for
- improvement. Either way, the name it finds in the "Subject:" line is not all
- that important since the name of the file also appears in the header of the
- first section of a uuencoded file. And most of the time (so even when it comes
- up with false information from the "Subject:" line), it will yield a correct
- result anyway.
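
A rough impression of how such lines can be taken apart is given below. This is a regular-expression sketch that happens to cope with the five examples above; it is not UU's parser, which is certainly more elaborate. The names `parse_subject`, `NAME` and `PART` are made up, and the rule "first thing that looks like an 8+3 name, then the first number pair joined by a slash or by 'of'" is an assumption.

```python
import re

# Something that looks like an 8+3 filename (assumed rule).
NAME = re.compile(r"\b([A-Za-z0-9_]{1,8}\.[A-Za-z0-9]{1,3})\b")
# Two numbers joined by "/" or "of", blanks allowed: 4/6, 3of20, 04 of5.
PART = re.compile(r"(\d+)\s*(?:/|of)\s*(\d+)", re.IGNORECASE)


def parse_subject(line):
    """Return (name, part, total) for a "Subject:" or "Description:" line.

    Numbers that cannot be determined are returned as 0; None is
    returned if no filename is found at all.
    """
    _, _, text = line.partition(":")       # drop the header keyword
    name = NAME.search(text)
    if name is None:
        return None
    part = PART.search(text, name.end())   # only look after the name
    if part is None:
        return name.group(1), 0, 0
    return name.group(1), int(part.group(1)), int(part.group(2))
```

Things like "(b/w)", "/0" and "(6 /w )" are skipped simply because they do not have a number on both sides. Comparing the returned names case-insensitively (for instance with `.upper()`) makes FOOBAR.JPG and FooBar.Jpg count as the same file.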
-
- And while on the subject of filenames: Most of the uuencoders also include
- the filename at the start of each (so not only the first) section, one way or
- another. For at least some of them, it may be the case that this name differs
- from the one that is in the header of the first section. And of course, this is
- also possible for the name UU filters out of the "Subject:" line. That is why,
- when using the /I switch, UU will give two names for each section it finds.
- The real name (i.e. the one from the header of the first section) is the one
- that is NOT parenthesised. And although UU will display the names exactly as
- they appear in the file, it will perform a case-insensitive comparison between
- these names, thus making up for capitalisation inconsistencies by the person
- who posted the file.
-
- Also when using the /I switch, UU will give the section number and the total
- number of sections for each section (as far as this could be determined of
- course). This is displayed as in "(003/010)", which would mean that
- this section is part three of a ten part file. Whenever a number could not be
- determined, "000" is printed instead. Finally (still when using the /I switch
- only), UU displays some information on any section it will not be able to
- process, as well as the reason for this.
-
- The remainder of this chapter holds for both the /I and /S switches: Whenever
- a filename that was encountered is longer than twelve characters, it will be
- displayed to the first eleven characters only, with an asterisk (*) appended
- to it. Of course, the full name will be displayed when prompting the user for
- a new filename.
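
The two display conventions just described (three-digit counters with "000" for unknown values, and long names cut to eleven characters plus an asterisk) amount to the following. The function names are hypothetical and the sketch only mirrors the description in this manual.

```python
def format_counts(part, total):
    """Format section number and section count as in "(003/010)".

    Pass 0 for a number that could not be determined; it prints as 000.
    """
    return "(%03d/%03d)" % (part, total)


def display_name(name):
    """Shorten names longer than twelve characters for display:
    the first eleven characters followed by an asterisk."""
    if len(name) > 12:
        return name[:11] + "*"
    return name
```

So `format_counts(3, 10)` gives "(003/010)", and `display_name("averylongname.tar.gz")` gives "averylongna*", which again is twelve characters wide.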
-
- When UU has scanned the input file, it will list the names, and numbers of
- sections of each COMPLETE file it has found. It also gives the total number
- of sections it has found, the number of sections it could not identify, and
- the number of sections that may be processed. Note that the latter number is
- not necessarily the difference of the former two, because there are various
- reasons that a section that WAS identified cannot be processed after all (for
- example when there are other sections of the same file missing). The actual
- reason will usually be given while using the /I switch.
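
Why the number of processable sections need not equal "found minus unidentified" is easiest to see in a small model. The sketch below assumes each scanned section has been reduced to a tuple (name, part, total), with None or 0 for whatever could not be determined; that representation and the name `summarise` are inventions for this example, not a description of UU's internals.

```python
from collections import defaultdict


def summarise(sections):
    """Count found, unidentified and processable sections.

    A file is complete when every part from 1 up to its total is
    present; only sections of complete files count as processable.
    """
    parts_seen = defaultdict(set)
    totals = {}
    unidentified = 0
    for name, part, total in sections:
        if name is None or part == 0 or total == 0:
            unidentified += 1
            continue
        key = name.upper()                 # case-insensitive comparison
        parts_seen[key].add(part)
        totals[key] = total
    complete = {key: totals[key] for key in parts_seen
                if parts_seen[key] == set(range(1, totals[key] + 1))}
    return {
        "found": len(sections),
        "unidentified": unidentified,
        "processable": sum(complete.values()),
        "complete": complete,
    }


scan = [("A.ZIP", 1, 2), ("a.zip", 2, 2),      # complete
        ("B.GIF", 1, 3), ("B.GIF", 3, 3),      # part 2 is missing
        (None, 0, 0)]                          # could not be identified
```

For `scan`, five sections are found and one is unidentified, yet only two are processable: the two sections of B.GIF were identified but cannot be used because a part is missing.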
-
- I have done my very best to make UU as smart as possible, but as noted earlier,
- due to the fact that the uuencoding standard is not strict enough, even the
- most intelligent uudecoder may not be able to correctly figure everything out.
- Let me end this chapter by quoting Nick Viner: "Of course some files which have
- been split by hand and not labelled adequately will always defeat it!"
-
-
- Plans for future versions of UU
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- With the exception of the first two points, the following plans do not have
- high priority for me ... but I am open to suggestions, so if you have any
- arguments in favour of any of these, or perhaps some new suggestions, please
- let me know.
-
- I can think of several ways of making UU even smarter. For instance, by adding
- support for even more uuencoders (if I find any). Another option is to have UU
- use the information it gathers but does not use so far, so as to have it make
- its own assumptions about sections that could only be partially identified.
- The latter case would then be as if UU said "These sections probably belong
- together ... well, let's assume they do, and process them.". Finally, the
- routine that deals with USENET's "Subject:" lines could be made yet a little
- smarter.
-
- Another option I plan to add, is to have UU be able to write every section
- that has not been processed to a separate file. Related to this would be an
- option to have UU output all non-encoded data.
-
- I am considering having UU be able to handle files whose sections are not
- all contained in one and the same file, so the PART1.UUE PART2.UUE PART3.UUE
- scheme, but I should add that this does not have high priority, since I only
- need this very rarely, and for these rare cases, I do not mind using the COPY
- command first in order to put everything in one file.
-
- As an alternative to the former, or even in addition to it, I may some day
- have UU accept wildcards in the filename.
-
- I am considering adding a switch (/d for instance) allowing one to have the
- input file deleted after it has been SUCCESSFULLY processed. Again, this does
- not have high priority for me, but on the other hand, it would be very easy
- to add this. So anyone in favour of this is kindly requested to react. (People
- who are against this option do not have to react I guess, because no one is
- forced to actually use all UU's options.)
-
- Some uuencoders put checksums in the files. I may have a future version of UU
- be able to check these.
-
- I may also write a uuencoder that is likewise very fast, and even smaller.
-
- I may add a third option to UU in case a file already exists, viz. "skip",
- which will allow the user to choose not to process this file, and continue
- with the next (if any).
-
- I may also add support for xxencoded files to UU.
-
- Someone suggested it would be nice if one could change UU's defaults, so that,
- for example, the /S switch would then be assumed automatically. I do not like
- to do this, since it would make using UU less easy. I think that naive users
- would be frightened by the prospect of having to edit some configuration file
- (or something like that) first. Moreover, I think typing "UU/S" instead of
- "UU" cannot be a real bother. Or stated differently: If I had given this
- program a longer name, then those extra characters would have to be entered
- anyway.
-
-
- Acknowledgements
- ~~~~~~~~~~~~~~~~
- I should like to thank the following persons:
- - Terry O'Brien for sending me detailed information on the file mode code
- in the header of uuencoded files, and on uuencoding in general.
- - Martin (sorry, don't know your last name) from Nottingham (?) for telling
- me about the bug :-( in version 1.1 (and 1.0).
- - Brian Norris for telling me about the bug :-( in version 1.3 (and earlier
- versions).
- - Douglas Swiggum for all the trouble taken in sending me "strange" uuencoded
- files, and detailed descriptions of what happened. You have saved me a lot
- of time in finding two bugs :-( in version 2.0!
- Last but not least, I should like to thank all the people who have let me
- know they appreciate my program, or otherwise (e.g. by telling me about bugs)
- mailed me regarding UU.
-
-
- Release history
- ~~~~~~~~~~~~~~~
- In my convention of version numbers, 0.x versions denote usually unreleased
- prototype versions.
-
- Versions 0.1 through 0.4, and 0.6 were private, unreleased versions, written
- in a mixture of Pascal and Assembly-language.
- Version 0.5 was given to but a few people to see how they liked it. It had
- resulted from a process of stepwise refinement in which speed, size, feedback,
- and user-friendliness were tackled. Versions 0.1 through 0.5 were all written
- on 11-Dec-93. They were EXE files, and the last of them had a size of 5872 bytes.
-
- UU 0.6 Type: EXE Size: 3424 Date: 14-Dec-93
- The last prototype version. Most of it written in assembly. Yet a bit
- faster than 0.5.
-
- UU 1.0 Type: COM Size: 1993 Date: 15-Dec-93
- The first publicly released version. But for some tiny details this is
- the full-assembly version of 0.6.
-
- UU 1.1 Type: COM Size: 1965 Date: 18-Dec-93
- Even smarter in distinguishing comment lines from encoded lines (a fourth
- test has been added). Sections containing only one non-empty line are now
- recognised as such. Detects when the disk is full, upon which it aborts
- with an appropriate message. Yet a bit faster than 1.0.
-
- UU 1.2 Type: COM Size: 1896 Date: 23-Dec-93
- Now really only accepts "y", "Y", "n", and "N" while asking permission to
- overwrite an existing file. Also, CTRL-Break (and CTRL-C) can be used at
- this point to abort the program immediately.
-
- UU 1.3 Type: COM Size: 1892 Date: 25-Dec-93
- In earlier versions, lines of more than 255 characters COULD (although it
- is HIGHLY improbable they actually WOULD) result in decoded files being
- corrupted; starting with this version, this can no longer happen. Yet a
- bit faster than 1.2 (partly, but not only, because the read and write
- buffers are now each 4k larger).
-
- UU 2.0 Type: COM Size: 5866 Date: 09-Jan-94
- Now also allows files containing unsorted sections. An intelligent command
- line parser has been added. Because of this, the bug of UU not accepting
- filenames of length 1 in the command line (in fact, I did not even know
- about this bug until some time after I had finished the parsing routines)
- no longer exists. Aborts with an appropriate message if there is not enough
- (conventional) RAM free. Displays an error message if invoked without any
- parameters or switches.
-
- UU 2.1 Type: COM Size: 6257 Date: 17-Jan-94
- I really thought I had solved the problem of lines containing more than
- 255 characters in version 1.3, but I had not; now, it is REALLY fixed.
- Added support for five more uuencoders and posting programs, as well as for
- "Description:" lines. Made the parser for "Subject:" (and "Description:")
- lines even more intelligent. Fixed a bug that seemed to matter only when
- run from the DOS box under Windows. The maximum number of unsorted sections
- UU can handle is slightly higher. Some minor changes not worth mentioning.
-
-
- Contacting the author <-- Hey, that's me! :-)
- ~~~~~~~~~~~~~~~~~~~~~
- Contact me (preferably using e-mail) if you have any questions, suggestions,
- remarks, etc., on this document, on UU, or on any other of my programs.
- Also, if you find a valid uuencoded file that UU does not process correctly,
- please let me know. And if at all possible, pray send that file along to me
- (or otherwise a detailed description of its contents), preferably in some
- (any) compressed form in order to keep my mail server from automagically
- ruining it. Beyond my control, my mail server automatically decodes (or tries
- to anyway) uuencoded files, so I would not end up with your uuencoded file.
- Thank you very much!
-
- I check the alt.binaries.pictures.misc and alt.binaries.pictures.utilities
- newsgroups on USENET regularly, so you could also try placing messages for
- me there. Finally, please send me an e-mail if you think my program is of
- use to you (or flame me if you think it is useless). If I do not get enough
- feedback, I take it that people are not interested, and I shall ... continue
- writing programs for myself, but DIScontinue spreading them on anything but
- a very small scale.
-
- Ben Jos Walbeehm (Please get my first name right, it is "Ben Jos".)
- Lijsterbeslaan 20
- 5248 BB Rosmalen
- The Netherlands
- Phone : +31 4192 14345 (The best time (GMT) to get hold of me is at night!)
- E-mail: Walbeehm@fsw.ruu.nl
-